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^ ! Abstract 



Since Giles introduced the multilevel Monte Carlo path simulation 
method [18j, there has been rapid development of the technique for a 
variety of applications in computational finance. This paper surveys 
^ ■ the progress so far, highlights the key features in achieving a high rate 

■ of multilevel variance convergence, and suggests directions for future 

^ ! research. 



1 Introduction 



In 2001, Heinrich [28], developed a multilevel Monte Carlo method for para- 
metric integration, in which one is interested in estimating the value of E[/(x, A)] 
^ I where x is a finite-dimensional random variable and A is a parameter. In the 

I simplest case in which A is a real variable in the range [0, 1], having estimated 

the value of E[/(a:, 0)] and E[/(a;, 1)], one can use ^(/(x, 0) + /(x, 1)) as a 
control variate when estimating the value of E[/(x, |)], since the variance of 
f{x, |) — 0) + /(x, 1)) will usually be less than the variance of /(x, |). 

This approach can then be applied recursively for other intermediate values of 
A, yielding large savings if /(x. A) is sufficiently smooth with respect to A. 

Giles' multilevel Monte Carlo path simulation [18j is both similar and dif- 
ferent. There is no parametric integration, and the random variable is infinite- 
dimensional, corresponding to a Brownian path in the original paper. However, 
the control variate viewpoint is very similar. A coarse path simulations is used 
as a control variate for a more refined fine path simulation, but since the ex- 
act expectation for the coarse path is not known, this is in turn estimated 
recursively using even coarser path simulation as control variates. The coars- 
est path in the multilevel hierarchy may have only one timestep for the entire 
interval of interest. 
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A similar two- level strategy was developed slightly earlier by Kebaier |31j . 
and a similar multi-level approach was under development at the same time 
by Speight [421 gS]. 

In this review article, we start by introducing the central ideas in multilevel 
Monte Carlo simulation, and the key theorem from [18] which gives the greatly 
improved computational cost if a number of conditions are satisfied. The chal- 
lenge then is to construct numerical methods which satisfy these conditions, 
and we consider this for a range of computational finance applications. 



2 Multilevel Monte Carlo 



2.1 Monte Carlo 

Monte Carlo simulation has become an essential tool in the pricing of deriva- 
tives security and in risk management. In the abstract setting, our goal is to 
numerically approximate the expected value ]E[y], where Y = P{X) is a func- 
tional of a random variable X. In most financial applications we are not able 
to sample X directly and hence, in order to perform Monte Carlo simulations 
we approximate X with X^t such that E[P{XAt)] IE[P(X)], when At 0. 
Using X^t to compute approximation samples produces the standard Monte 
Carlo estimate 



Y 



1 ^ 



AtJ^ 



where is the numerical approximation to X on the ith sample path and 
is the number of independent simulations of X. By standard Monte Carlo 
results Y E[Y], when At — and A^ — )■ oo. In practice we perform Monte 
Carlo simulation with given At > and finite A^ producing an error to the 
approximation of K[Y]. Here we are interested in the mean square error that 
is 



MSE^E 



{Y-E[Y]f 



Our goal in the design of the Monte Carlo algorithm is to estimate Y with 
accuracy root-mean-square error e {MSE < e'^), as efficiently as possible. 
That is to minimize the computational complexity required to achieve the 
desired mean square error. For standard Monte Carlo simulations the mean 
square error can be expressed as 



E 



(Y -E[Y]y 



(Y -E[Y]+E[Y] - E[Y]y 



(F-E[F])^ 



E[Y] - E[Y]f 



Monte Carlo variance bias of the approximation 
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The Monte Carlo variance is proportional to jj 



Y[Y] 



ATS 



V 



N 



i=l 



N 



V[P(XAt)]. 



For both Euler-Maruyama and Milstein approximation |IE[y] — E[y]| = 0{At), 
typically. Hence the mean square error for standard Monte Carlo is given by 



E 



(f - E[Y]f 



0{-) + 0{At'] 



To ensure the root-mean-square error is proportional to e, we must have 
MSE = 0{€'^) and therefore 1/N = 0{e'^) and At^ = 0{e'^), which means 
= 0{e~'^) and At = 0{e). The computational cost of standard Monte Carlo 
is proportional to the number of paths multiplied by the cost of generating 
a path, that is the number of timesteps in each sample path. Therefore, the 
cost is C = 0{e~^). In the next section we will show that using MLMC we 
can reduce the complexity of achieving root mean square error e to O^e^"^). 



2.2 Multilevel Monte Carlo Theorem 

In its most general form, multilevel Monte Carlo (MLMC) simulation uses a 
number of levels of resolution, i = 0,1, . . . , L, with i = being the coarsest, 
and i = L being the finest. In the context of a SDE simulation, level may 
have just one timestep for the whole time interval [0, T], whereas level L might 
have 2^ uniform timesteps At^ = 2~^T. 

If P denotes the payoff (or other output functional of interest), and Pe 
denotes its approximation on level /, then the expected value ]E[Pi] on the 
finest level is equal to the expected value E[Po] on the coarsest level plus a 
sum of corrections which give the difference in expectation between simulations 
on successive levels, 

L 

E[Pl] = E[Po] + $^E[P, - P,_i]. (1) 
e=i 

The idea behind MLMC is to independently estimate each of the expectations 
on the right-hand side of ([1]) in a way which minimises the overall variance 
for a given computational cost. Let Yq be an estimator for E[Po] using A^q 
samples, and let Yi, ^ > 0, be an estimator for E[P^ — P^_i] using A^^ samples. 
The simplest estimator is a mean of A'^ independent samples, which for £ > 
is 

Y, = Ni'Y.^Pi~Pl,). (2) 

1=1 

The key point here is that P/ — P/_i should come from two discrete approxi- 
mations for the same underlying stochastic sample (see [39]), so that on finer 
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levels of resolution the difference is small (due to strong convergence) and so 
its variance is also small. Hence very few samples will be required on finer 
levels to accurately estimate the expected value. 
The combined MLMC estimator Y is 

1=0 

We can observe that 

1=1 

and 

L L 

E[Y] = = E[Po] + ^E[P, - P,_i] = E[Pl]. 

e=o e=i 
Although we are using different levels with different discretization errors to 
estimate ]E[P], the final accuracy depends on the accuracy of the finest level 
L. 

Here we recall the Theorem from |l8j (which is a slight generalisation of 
the original theorem in [18]) which gives the complexity of MLMC estimation. 

Theorem 2.1. Let P denote a functional of the solution of a stochastic differ- 
ential equation, and let Pe denote the corresponding level i numerical approx- 
imation. If there exist independent estimators Yg based on Ni Monte Carlo 
samples, and positive constants a, /3, 7, Ci, C2, C3 such that a > | min(/3, 7) and 

i) |E[P,-P]|<ci2-"^ 



It) E[F,] 



E[Po], ^ = 

E[P,-P,_i], £>0 

III) Y[Ye\ < C2Ni'2-'^^ 

iv) Ce < C3 A^£2'^^, where Ce is the computational complexity ofYi 

then there exists a positive constant C4 such that for any e < there are 
values L and for which the multilevel estimator 

£=0 

has a mean- square- error with bound 

MSE = E[{Y - E[P])^] < 
with a computational complexity C with bound 

/3>7, 

C < 
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2.3 Improved MLMC 



In the previous section we showed that the key step in MLMC analysis is the 
estimation of variance ^[Pl — Pl-i]- As it will become more clear in the next 
section, this is related to the strong convergence results on approximations of 
SDEs, which differentiates MLMC from standard MC, where we only require 
a weak error bound for approximations of SDEs. 

We will demonstrate that in fact the classical strong convergence may not 
be necessary for a good MLMC variance. In ([2]) we have used the same esti- 
mator for the payoff Pe on every level i, and therefore ([T]) is a trivial identity 
due to the telescoping summation. However, in [17] Giles demonstrated that 
it can be better to use different estimators for the finer and coarser of the two 
levels being considered, P/ when level i is the finer level, and P[ when level i 
is the coarser level. In this case, we require that 

E[P/]=E[P;] for £ = 1,...,L, (3) 

so that 

L 

E[p[] = E[p^] + j2npl-pu- 

t=\ 

The MLMC Theorem is still applicable to this modified estimator. The ad- 
vantage is that it gives the flexibility to construct approximations for which 
P/ - Pti is much smaller than the original P, - P,_„ giving a larger value for 
/3, the rate of variance convergence in condition iii) in the theorem. In the next 
sections we demonstrate how suitable choices of P/ and P^ can dramatically 
increase the convergence of the variance of the MLMC estimator. 

The good choice of estimators, as we shall see, often follows from analysis 
of the problem under consideration from the distributional point of view. We 
will demonstrate that methods that had been used previously to improve the 
weak order of convergence can also improve the order of convergence of the 
MLMC variance. 

2.4 SDEs 

First, we consider a general class of rf-dimensional SDEs driven by Brownian 
motion. These are the primary object of studies in mathematical finance. In 
subsequent sections we demonstrate extensions of MLMC beyond the Brown- 
ian setting. 

Let (n, J^, {J^i}t>o, P) be a complete probability space with a filtration 
{'^t}t>o satisfying the usual conditions, and let be a m-dimensional Brow- 
nian motion defined on the probability space. We consider the numerical ap- 
proximation of SDEs of the form 

da;(t) = /(x(t)) dt + g{x{t)) dw(t), (4) 
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where x{t) G R"^ for each t > 0, / G C2(Mrf,M'^), g g C^{R'^,R'^''"'), and 
for simphcity we assume a fixed initial value Xq G M'^. The most prominent 
example of SDEs in finance is a geometric Brownian motion 

dx{t) = ax{t) dt + I3x{t) dw{t), 

where > 0. Although, we can solve this equation explicitly it is still 
worthwhile to approximate its solution numerically in order to judge the per- 
formance of the numerical procedure we wish to apply to more complex prob- 
lems. Another interesting example is the famous Heston stochastic volatility 
model 

{ds{t) = rs{t) dt + s{t)y^v{t)dwi{t) 
dv{t) = k{9 -v{t))dt + cr^/vit)dw2{t) (5) 
dwi dw2 = pdt, 

where r,K,,6,a > 0. In this case we do not know the explicit form of the 
solution and therefore numerical integration is essential in order to price certain 
financial derivatives using the Monte Carlo method. At this point we would 
like to point out that the Heston model (|5]) does not satisfy standard conditions 
required for numerical approximations to converge. Nevertheless, in this paper 
we always assume that coefficients of SDEs (jl]) are sufficiently smooth. We 
refer to [32l |35l HI] for an overview of the methods that can be applied when 
the global Lipschitz condition does not hold. We also refer the reader to [33] 
for an application of MLMC to the SDEs with additive fractional noise. 

2.5 Euler and Milstein discretisations 

The simplest approximation of SDEs @ is an Euler-Maruyama (EM) scheme. 
Given any step size Ate, "we define the partition VAt^ '■= {nAte : n = 0, 1, 2, 2^} 
of the time interval [0,T], 2^ At = T > 0. The EM approximation ^ 
x{nAtg) has the form [31] 

Xi^, = Xi + f{Xl) At, + g{Xl:) Awi^„ (6) 

where Aw^^-^ = w{{n + l)Ati) — w{nAti) and Xq = xq. Equation (E]) is written 
in a vector form and its i^^ component reads as 

m 

XUi = + /,(X^) At, + 9^AK) A<n+i- 

i=i 

In the classical Monte Carlo setting we are mainly interested in the weak 
approximation of SDEs dl]). Given a smooth payoff P : — )■ M we say that 
X^i converges to x{T) in a weak sense with order a if 

|E[P(x(T))]-E[P(X^)]|=0(At?). 
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Rate a is required in condition [%) of Theorem 12.11 However, for MLMC 
condition iiii) of Theorem 12. II is crucial. We have 

V, = Var (P,-P,_i) < E {{P,-P,_^f\ , 

and 

E [(P,-P,_i)^] < 2E[(P,-P)2] +2E[(P-P,_i)2] . 
For Lipschitz continuous payoffs, (P(x) — P{y)y < L \\x — y\f , we then have 

E [(P^-P)2] < LE [||x(T)-X^ 

It is clear now, that in order to estimate the variance of the MLMC we need 
to examine strong convergence property. The classical strong convergence on 
the finite time interval [0, T] is defined as 

{E[\\x{T)-X^\\''])^^''' = 0{A4), for p > 2. 

For the EM scheme ^ = 0.5. In order to deal with path dependent options we 
often require measure the error in the supremum norm: 

E[ sup ||a;(nAt^) -X^||M = C(At«) for p>2. 

0<n<2'^ J 

Even in the case of globally Lipschitz continuous payoff P, the EM does not 
achieve /3 = 2^ > 1 which is optimal in Theorem (12. ip . In order to improve the 
convergence of the MLMC variance the Milstein approximation Xn ~ x{n Ati) 
is considered, with i^^ component of the form 



n+1 

(7) 

+ h,,k{Xi) {Awl^Awl^ - AU - A%^n) 
j,k=i 

where fl is the correlation matrix for the driving Brownian paths, and „ is 
the Levy area defined as 

l'{n+l)Ate 

= / ( iwj{t)-Wj{nAte)) dwkit) - {wk{t)-Wk{nAtt)) dwj{t)) . 

JnAtc 

The rate of strong convergence ^ for the Milstein scheme is double the value 
we have for the EM scheme and therefore the MLMC variance for Lipschitz 
payoffs converges twice as fast. However, this gain does not come without 
a price. There is no efficient method to simulate Levy areas, apart from 
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dimension 2 [TU HH US]. In some applications, the diffusion coefficient g{x) 
satisfies a commutativity property which gives 

hijk{x) = hikj{x) for all z,j, fc. 

In that case, because the Levy areas are anti-symmetric (i.e. A'-jk^n — ~^fcj>); 
it follows that hijk{X!^) A'--^,^ + hikj{X^) A'-f^-,^ = and therefore the terms in- 
volving the Levy areas cancel and so it is not necessary to simulate them. 
However, this only happens in special cases. Clark & Cameron proved for 
a particular SDE that it is impossible to achieve a better order of strong con- 
vergence than the Euler-Maruyama discretisation when using just the discrete 
increments of the underlying Brownian motion. The analysis was extended 
by Miiller-Gronbach [38j to general SDKs. As a consequence if we use the 
standard MLMC method with the Milstein scheme without simulating the 
Levy areas the complexity will remain the same as for Euler-Maruyama. Nev- 
ertheless, Giles and Szpruch showed in [22] that by constructing a suitable 
antithetic estimator one can neglect the Levy areas and still obtain a multi- 
level correction estimator with a variance which decays at the same rate as 
the scalar Milstein estimator. 



2.6 MLMC algorithm 

Here we explain how to implement the Monte Carlo algorithm. Let us recall 
that the MLMC estimator Y is given by 

Y = ±Y,. 

£=0 

We aim to minimize the computational cost necessary to achieve desirable 
accuracy e. As for standard Monte Carlo we have 

E [{Y - E[P(X)])2] =e\{Y - E[Y]y] + (E[Pl] - E[P(X)])2) . 



Monte Carlo variance 

The variance is given by 



bias of the approximation 



Y[Y] = j2Wi] = i2w^'' 

where Ve = V[P^— P^_i]. To minimize the variance of Y for fixed computational 
cost C = "^£=0 NgAtJ^ , we can treat Ni as continuous variable and use the 
Lagrange function to find the minimum of 



1=0 ^ \e=o / 
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First order conditions shows that Ni 



A 2 y/ViAti, therefore 




Since we want V[y] < ^ we can show that 




thus the optimal number of samples for level i is 



L 
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Assuming 0{A.ti) weak convergence, the bias of the overall method is equal 
cAIl = cT2~^. If we want the bias to be proportional to -4^ we set 



From here we can calculate the overall complexity. We can now outline the 
algorithm 

1. Begin with L=0; 

2. Calculate the initial estimate of Vl using 100 samples. 

3. Determine optimal using ([S]). 

4. Generate additional samples as needed for new A^^. 

5. if L < Lmax set L := L + 1 and go to 2. 

Most numerical tests suggests that Lmax is not optimal and we can sub- 
stantially improve MLMC by determining optimal L by looking at bias. For 
more details see [T8] . 



A key application of MLMC is to compute the expected payoff of financial 
options. We have demonstrated that for globally Lipschitz European payoffs, 
convergence of the MLMC variance is determined by the strong rate of con- 
vergence of the corresponding numerical scheme. However, in many financial 
applications payoffs are not smooth or are path-dependent. The aim of this 



L 



max 



log(e/(cTv/2))-^ 
log 2 



3 Pricing with MLMC 
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section is to overview results on mean square convergence rates for Euler- 
Maruyama and Milstein approximations with more complex payoffs. In the 
case of EM, the majority of payoffs encountered in practice have been analyzed 
in Giles et al. [20]. Extension of this analysis to the Milstein scheme is far from 
obvious. This is due to the fact that Milstein scheme gives an improved rate 
of convergence on the grid points, but this is insufficient for path dependent 
options. In many applications the behaviour of the numerical approximation 
between grid points is crucial. The analysis of Milstein scheme for complex 
payoffs was carried out in [H]. To understand this problem better, we recall a 
few facts from the theory of strong convergence of numerical approximations. 
We can define a piecewise linear interpolation of a numerical approximation 
within the time interval [nAt^, (n + l)At£) as 

X'{t)=Xi + Xe{X'^^,-Xi), for t e [nAU, {n + l)At,) (9) 

where = {t — nAti)/At£. Miiller-Gronbach [37] has show that for the 
Milstein scheme ([9]) we have 

E[sup ||a;(t)-X^(t)f] =0(1 At,log(At,) p>2, (10) 

0<t<T 

that is the same as for the EM scheme. In order to maintain the strong order of 
convergence we use Brownian Bridge interpolation rather than basic piecewise 
linear interpolation: 

X'it) = Xi + XeiXi^,-Xi) + giXi) {wit)-winAte)-\Awi^^,), (11) 

for t G [nAti, (n + l)At^). For the Milstein scheme interpolated with Brownian 
bridges we have [37j 

E[ sup x{t)-X\t) "] = 0{\ Atelog{Ati) 

0<t<T 

Clearly X^{t) is not implementable, since in order to construct it, the knowl- 
edge of the whole trajectory {w(t))o<t<T is required. However, we will demon- 
strate that combining X^{t) with conditional Monte Carlo techniques can dra- 
matically improve the convergence of the variance of the MLMC estimator. 
This is due to the fact that for suitable MLMC estimators only distributional 
knowledge of certain functionals of {w{t))o<t<T will be required. 



3.1 Euler-Maruyama scheme 

In this section we demonstrate how to approximate the most common payoffs 
using the EM scheme 

The Asian option we consider has the payoff 

p = (t^ x{t) dt-K 
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Using the piecewise linear interpolation (Q one can obtain the following ap- 
proximation 

Pi ^ / X\t) dt = T-^Y.\ iK+XUi), 

•^0 n=0 

Lookback options have payoffs of the form 

P = x(T) - inf x(t). 

0<t<T 

A numerical approximation to this payoff is 

P,^X^- inf 

0<n<2^ 

For both of these payoffs it can be proved that Vi = 0{Ati) [20j. 

We now consider a digital option, which pays one unit if the asset at the 
final time exceeds the fixed strike price K, and pays zero otherwise. Thus, the 
discontinuous payoff function has the form 

P = '^{x{T)>K}, 

with the corresponding EM value 

Pe = '^{x^>K}- 

Assuming boundedness of the density of the solution to (jlj) in the neighborhood 

1/2 S 

of the strike K, it has been proved in |20] that = o{At/ ), for any 
6 > 0. This result has been tightened by Avikainen [3] who proved that 
Ve = 0{Aty^logAte). 

An up-and-out call gives a European payoff if the asset never exceeds the 
barrier, B, otherwise it pays zero. So, for the exact solution we have 

P = (xiT) - i^) + l{supo<,<^:.(t)<iJ}, 

and for the EM approximation 

A down-and-in call knocks in when the minimum asset price dips below the 
barrier B, so that 

P = i^iT) - i^) + l{info<t<T^'W<B}' 

and, accordingly. 

Pi = {X^ - K)+l{i^i^^^^^^xi<B}- 

For both of these barrier options we have = o{At/ ), for any 5 > 0, 
assuming that info<t<T x(t) and supg<(<j. x(t) have bounded density in the 
neighborhood of B [20] . 

As summarised in Table [H numerical results taken form [17J suggest that 
all of these results are near-optimal. 
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Euler 


option 


numerical 


analysis 


Lipschitz 


0{At,) 


0{Ate) 


Asian 


0{AU) 


OiAte) 


lookback 


0{AU) 


0{AU) 


barrier 




o{At]'''') 


digital 




0{At\'^ log At,) 



Table 1: Orders of convergence for as observed numerically and proved ana- 
lytically for both Euler discretisations; 5 can be any strictly positive constant. 

3.2 Milstein scheme 

In the scalar case of SDEs (j4]) (that is with d = m = 1) the Milstein scheme 
has the form 

Xi,^, = Xl + f{xt)At, + g{X'^)AwU,+g\xl,)g{X'M^wl^i?-At,), (12) 

where g' = dg/dx. The analysis of Lipschitz European payoffs and Asian 
options with Milstein scheme is analogous to EM scheme and it has been 
proved in jjlj that in both these cases = 0{Atj). 

3.2.1 Lookback options 

For clarity of the exposition we will express the fine time-step approximation 
in terms of the coarse time-step, that is V At^ := {nAti^i : n = 0, |, 1, 1 + 
i, 2, 2^^^}. The partition for the coarse approximation is given by Vai^.i '■= 
{nAti_i : n = 0, 1, 2, 2^~^}. Therefore, X!^~^ corresponds to for n = 
0,1, 2, ...,2^-1. 

For pricing lookback options with the EM scheme, as an approximation of the 
minimum of the process we have simply taken min^X^. This approximation 
could be improved by taking 

Xi„ = mm (X^ - P*giXi)Aty') . 

-1 /o 

Here (3* ~ 0.5826 is a constant which corrects the 0{At/ ) leading order error 
due to the discrete sampling of the path, and thereby restores 0{Ate) weak 
convergence pj. However, using this approximation, the difference between 

1 /2 

the computed minimum values and the fine and coarse paths is 0{At/ ), and 
hence the variance is (9 (At,), corresponding to /3 = 1. In the previous 
section, this was acceptable because /3 = 1 was the best that could be achieved 
in general with the Euler path discretisation which was used, but we now aim 
to achieve an improved convergence rate using the Milstein scheme. 
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In order to improve the convergence, the Brownian Bridge interpolant X^{t) 
defined in ( ITT]) is used. We have 

min X^{t) = min \ min X^{t)] 

0<t<T o<n<2'^-l-i nAti_i<t<(n+i)At,_i 

min ^n,mm5 
0<n<2'^-i-i 

where minimum of the fine approximation over the first half of the coarse 
time-step is given by 



n,min 2 I " n+- 



{K+r^nf -2g{XiyAti\ogU^y (13) 



and minimum of the fine approximation over the second half of the coarse 
time-step is given by 



(14) 

where Ui,U^ i are uniform random variables on the unit interval. For the 

coarse path, in order to improve the MLMC variance a slightly different es- 
timator is used, see ([3]). Using the same Brownian increments as we used on 
the fine path (to guarantee that we stay on the same path), equation (fTTl) is 
used to define = X^~^{{n + |)At£_i). Given this interpolated value, the 

minimum value over the interval [?2At^_i, (n + l)At^_i] can then be taken to 
be the smaller of the minima for the two intervals [nAt^_i, (n + |)At^_i) and 
[(n + i)At,_i,(n+l)At^_i), 



K:Ln = \[Xn'+Kl\-^{KH-^'-'') -2(^(X^l))2%ilogt/^ 



Note that g{X^~^) is used for both time steps. It is because we used the Brow- 
nian Bridge with diffusion term g{X^~^) to derive both minima. If we changed 
g{X^~^) to g{X^i) in X^^ this would mean that different Brownian 

Bridges were used on the first and second half of the coarse time-step and as a 
consequence condition ([3]) would be violated. Note also the re-use of the same 
uniform random numbers U!^ and U^.i used to compute the fine path mini- 

mum. The min(X^^^„, X^\ ^.^) has exactly the same distribution as X^~^-^, 

since they are both based on the same Brownian interpolation, and therefore 
equality ([3]) is satisfied. Giles et al. [11] proved the following Theorem: 
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Theorem 3.1. The multilevel approximation for a lookback option which is a 
uniform Lipschitz function of x{T) and inf[o,r]x(t) has V/ = ©(At^^"^) for any 
5>0. 

3.3 Conditional Monte Carlo 

Giles [13 and Giles et al. |Ilj have shown that combining conditional Monte 
Carlo with MLMC results in superior estimators for various financial payoffs. 

To obtain an improvement in the convergence of the MLMC variance bar- 
rier and digital options, conditional Monte Carlo methods is employed. We 
briefly describe it here. Our goal is to calculate E[P]. Instead, we can write 

E[P] = E[E[P I Z]], 

where Z is a random vector. Hence E[P | Z] is an unbiased estimator of E[P]. 
We also have 

Var [P] = E[Var [P | Z]] + Var [E[P | Z]] , 

hence Var [E[P | Z]] < Var(P). In the context of MLMC we obtain a 
better variance convergence if we condition on different vectors on the fine 
and the coarse level. That is on the fine level we take E[P-^ | Z-^], where 
Z-^ = {X^}o<n<2'^- Oil the coarse level instead of taking E[P'^ | Z'^] with 
Z' = {X^i}o<„"<2^-i, we take E[P^ | Z',Z% where Z' = {Xl~\}o<n<2^-i are 

obtained from equation (fTTj) . Condition (|3]) trivially holds by tower property 
of conditional expectation 

E [E[P^ I Z"]] = E[P^] = E 

3.4 Barrier options 

The barrier option which is considered is a down-and-out option for which 
the payoff is a Lipschitz function of the value of the underlying at maturity, 
provided the underlying has never dropped below a value P G M, 

P = /(X(T)) 1{,^T}. 

The crossing time r is defined as 

r = inf {x{t) < B} . 

This requires the simulation of (x(T), l^>r)). The simplest method sets 

r^*^ = inf {Xi < B} 

n 

and as an approximation takes (X2f_i, Ij^At^^g'^-i})- But even if we could 
simulate the process {a;(nAt^)}o<„<2*-i it is possible for {x(t)}o<t<T to cross 



E[P^ I Z"] 
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the barrier between grid points. Using the Brownian Bridge interpolation we 
can approximate 1{t>t} by 



.t-i_i 

2 

n 1 

n=0 



. >B}- 



This suggests following the lookback approximation in computing the mini- 
mum of both the fine and coarse paths. However, the variance would be larger 
in this case because the payoff is a discontinuous function of the minimum. 
A better treatment, which is the one used in is to use the conditional 
Monte Carlo approach to further smooth the payoff. Since the process is 
Markovian we have 



^ 2 



E 



E 



n=0 

nt-l 1 
^ 2 

n=0 

nt-l 1 

r ^2 

L n=0 
2^-1 — 1 

fix',.-.) Uii-Pi 



E 



n=0 



where from E 



K=P| inf ^ X(t)<5|X^,X^^, 



exp 



-2(X^-i?) + (X^^,-fi)- 



and 



pi_,,=F{ inf X(t)<i?|X^^„X^+i 

' ' in+^)Ate<t<{n+l)Ate 



-2{Xl^,-By{Xi^,-B) 



exp 



Hence, for the fine path this gives 



pi=f{x',.-.) n (1 



(16) 



?i=0 
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The payoff for tlie coarse patli is defined similarly. However, in order to reduce 
the variance, we subsample -^^~\, as we did for lookback options, from the 



Brownian Bridge connecting ^ and X^_^\ 

-{X^-i. >B}\ 



2'^-i-l 

n >B}] 

n=0 

2^-1-1 



E 



E 



E 



I- n,min — J 



n=0 



'1 ) • • • ) 1 ) ^2^~lJ 

2 2 



n=0 



/(^2^--o n (^-pV)(^ 



P2,n J 



n=0 



where 



and 



Pl,n 



-exp 



-2(x^;\-i?)+(x^;l-5)- 



Note that the same g{X^ ^) is used (rather than using g{X'^_^\) in P2,n ) to 

calculate both probabilities for the same reason as we did for lookback options. 
The final estimator can be written as 



Pll = f{X'--\) n (1 - Pin){^ - pin)- 



(17) 



n=0 



Giles et al. pi] proved the following theorem 

Theorem 3.2. Provided inf[o,T] |fi'(-B)| > 0, and inf p^T] 3;(t) has a hounded 
density in the neighbourhood of B, then the multilevel estimator for a down- 
and-out harrier option has variance Ve = o(At^'^^ ^) for any 6>0. 

The reason the variance is approximately o{At^/'^~^) instead of 0{At'j) is 
the following: due to the strong convergence property the probability of the 
numerical approximation being outside At^~''-neighbourhood of the solution 
to the SDE (jl]) is arbitrary small, that is for any e > 



P 



sup 

0<nAte<T 



\x(nAtf 



XlW > At 



l-e 



< At 



-p+pe 



E 



sup 

0<nAtf<T 



IxinAtf) -X, 



e\\p 



18) 



0(Af). 
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If inf[o,T] x{t) is outside the At/ -neighborhood of the barrier B then by (1181) 
it is shown that so are numerical approximations. The probabilities of crossing 
the barrier in that case are asymptotically either or 1 and essentially we are in 

1 /2 

the Lipschitz payoff case. If the inf [o,t] x(t) is within the At/ -neighbourhood 
of the barrier B then so are the numerical approximations. In that case it can 
be shown that E[(P/ - P/_i)^] = 0{At^-^) but due to the bounded density 

1/2 

assumption, the probability that inf[o^T] x{t) is within At/ -neighbourhood of 
the barrier B is of order At/ . Therefore the overall MLMC variance is 
Ve = a{Af^'^) for any 6>0. 



3.5 Digital options 

A digital option has a payoff which is a discontinuous function of the value of 
the underlying asset at maturity, the simplest example being 

P = l{x.(T)>B}- 

Approximating l{x'(r)>_B} based only on simulations of x{T) by Milstein scheme 
will lead to an 0{Atf) fraction of the paths having coarse and fine path ap- 
proximations to x(T) on either side of the strike, producing Pg — Pg^i = ±1, 
resulting in = 0{Ate). To improve the variance to 0{At/ ) for all 5>0, 
the conditional Monte Carlo method is used to smooth the payoff (see section 
7.2.3 in [24]). This approach was proved to be successful in Giles et al. [TT] 
and was tested numerically in |16j. 

If X^f.i i denotes the value of the fine path approximation one time-step 
before maturity, then the motion thereafter is approximated as Brownian mo- 
tion with constant drift /(-^2<^_i_i) and volatility 5'(-^2'?_i_ i )• The conditional 

expectation for the payoff is the probability that X^f_i > B after one further 
time-step, which is 

' X^i_^_i+f{X^i_^_i)Ate - B' 

itUqZZJUM; ^^^^ 



E 



X' 



where $ is the cumulative Normal distribution. 

For the coarse path, we note that given the Brownian increment Awij}^ i 

^ 2 

for the first half of the last coarse time-step (which comes from the fine path 
simulation), the probability that X^i^^ >B is 

Pi-i =E I X!/[\_^, Aw^-\ 

-f{X'^i\_/)At,^,+g{X'^i\_/)Aw[-\_, - B\ (20) 




1-1 



The conditional expectation of ( l20l) is equal to the conditional expectation of 
P/-i defined by (fT9|) on level i—i, and so equality ([3]) is satisfied. A bound on 
the variance of the multilevel estimator is given by the following result: 
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Theorem 3.3. Provided g{B) ^ 0, and x{t) has a hounded density in the 
neighbourhood of B, then the multilevel estimator for a digital option has vari- 
ance Vi = o(Atf^^ ^) for any 6>0. 

Results of the above section were tested numerically in [17] and are sum- 
marized in the table |2l 



option 


Mib 
numerical 


jtein 
analysis 


Lipschitz 

Asian 

lookback 

barrier 

digital 


0{Atj) 

0{Atf) 

0{Atj) 

0{At]'^) 

0{At]'^) 


0{At}) 
0{Atj) 

o{At]-') 

oiAi^/'-') 

o{Atf-') 



Table 2: Orders of convergence for Vi as observed numerically and proved 
analytically for Milstein discretisations; 6 can be any strictly positive constant. 



4 Greeks with MLMC 

Accurate calculation of prices is only one objective of Monte Carlo simulations. 
Even more important in some ways is the calculation of the sensitivities of the 
prices to various input parameters. These sensitivities, known collectively as 
the "Greeks" , are important for risk analysis and mitigation through hedging. 

Here we follow the results by Burgos at al. [8] to present how MLMC 
can applied in this setting. The pathwise sensitivity approach (also known 
as Infinitesimal Perturbation Analysis) is one of the standard techniques for 
computing these sensitivities [23]. However, the pathwise approach is not ap- 
plicable when the financial payoff function is discontinuous. One solution to 
these problems is to use the Likelihood Ratio Method (LRM) but its weak- 
nesses are that the variance of the resulting estimator is usually 0{At^^). 

Three techniques are presented that improve MLMC variance: payoff smooth- 
ing using conditional expectations [24j; an approximation of the above tech- 
nique using path splitting for the final timestep [2]; the use of a hybrid com- 
bination of pathwise sensitivity and the Likelihood Ratio Method [19j. We 
discuss the strengths and weaknesses of these alternatives in different multi- 
level Monte Carlo settings. 

4.1 Monte Carlo Greeks 

Consider the approximate solution of the general SDE (jl]) using Euler discreti- 
sation iQ. The Brownian increments can be defined to be a linear transfor- 
mation of a vector of independent unit Normal random variables Z. 
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The goal is to efficiently estimate the expected value of some financial payoff 
function P{x{T)), and numerous first order sensitivities of this value with 
respect to different input parameters such as the volatility or one component 
of the initial data a;(0). In more general cases P might also depend on the 
values of process {x(t)}Q<t<T at intermediate times. 

The pathwise sensitivity approach can be viewed as starting with the ex- 
pectation expressed as an integral with respect to Z: 



Ve^E [PiX'^iZ,9))] = J P{Xl{Z,e)) pz{Z) dZ. 



(21) 



Here 9 represents a generic input parameter, and the probability density func- 
tion for Z is 

p^(Z) = (27r)-'^/2exp(-||Z||2/2), 

where d is the dimension of the vector Z. 

Let X^ = X!^{Z,9). If the drift, volatility and payoff functions are all 
different iable, f l?T]) may be differentiated to give 



dV, _ f dPjXj) dxi 



dX^ 

with " being obtained by differentiating (E]) to obtain 

dxU = ^ , ( dfjxle) dxl djVC9)\ 

de de \ dx^ de de ; ' 

( dg{xi,e) dxi dgixi,e) \ 

V 9xi de de J 



(23) 



I 

n+l- 



We assume that Z — y Aif^^]^ mapping does not depend on e. It can be proved 
that ( 122|) remains valid (that is we can interchange integration and differentia- 
tion) when the payoff function is continuous and piecewise differentiable, and 
the numerical estimate obtained by standard Monte Carlo with M independent 
path simulations 

9P(X^'™) dXi'"" 



dxi de 



is an unbiased estimate for dV/de with a variance which is 0{M ^), if P{x) is 
Lipschitz and the drift and volatility functions satisfy the standard conditions 
i34j. 

Performing a change of variables, the expectation can also be expressed as 
Vi^E [P{Xi)] = j P{x) pxiix, e)dx, (24) 

where px' {x, e) is the probability density function for X^ which will depend 
on all of the inputs parameters. Since probability density functions are usually 
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smooth, can be differentiated to give 

de ^ ' dd / ' ' dd 



' ' de 

which can be estimated using the unbiased Monte Carlo estimator 

m=l 

This is the Likehhood Ratio Method. Its great advantage is that it does not 
require the differentiation of P(X^). This makes it apphcable to cases in 
which the payoff is discontinuous, and it also simplifies the practical imple- 
mentation because banks often have complicated flexible procedures through 
which traders specify payoffs. However, it does have a number of limitations, 
one being a requirement of absolute continuity which is not satisfied in a few 
important applications such as the LIBOR market model 



4.2 Multilevel Monte Carlo Greeks 

The MLMC method for calculating Greeks can be written as 

dV _ dEjP) dEjPL) _ dEjPo) ^ dEjPf - Pti) . . 

09 09 ^ 09 09 ^ 09 ' ^ ^ 

Therefore extending Monte Carlo Greeks to MLMC Greeks is straightforward. 
However, the challenge is to keep the MLMC variance small. This can be 
achieved by appropriate smoothing of the payoff function. The techniques 
that were presented in section W!l\ are also very useful here. 

4.3 European call 

As an example we consider an European call P = {x{T) —B)^ with x{t) being 
a geometric Brownian motion with Milstein scheme approximation given by 

XUi = K + rX^At, + aX^A<, + y ((A^^^,)2 - At,). (26) 

We illustrate the techniques by computing delta (5) and vega (z/), the sensi- 
tivities to the asset's initial value x(0) and to its volatility a. 

Since the payoff is Lipschitz, we can use pathwise sensitivities. We observe 
that 

d , f 0, for X < 5 

— X — B] 



Ox^ ' \ 1, for X > 5 

This derivative fails to exists when x = B, but since this event has probability 
0, we may write 



Ox ' 

Therefore we are essentially dealing with a digital option. 



{X-K)+ = 1{X>B}- 
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4.4 Conditional Monte Carlo for Pathwise Sensitivity 

Using conditional expectation the payoff can be smootli as we did it in Section 
13.21 European calls can be treated in the exactly the same way as Digital 
option in Section 13.21 that is instead of simulating the whole path, we stop at 
the penultimate step and then on the last step we consider the full distribution 
of (X^, I 

For digital options this approach leads to f lT9|) and fl20|) . For the call options 
we can do analogous calculations. In [8] numerical results for this approach 
obtained, with scalar Milstein scheme used to obtain the penultimate step. 
They results are presented in Table [31 For lookback options conditional ex- 
pectations leads to (fT3|l and ( !T5|) and for barriers to ( [T6|) and ( ITT]) . Burgos 
et al [8], applied pathwise sensitivity to these smoothed payoffs, with scalar 
Milstein scheme used to obtain the penultimate step, and obtained numerical 
results that we present in Table ID 





Call 


Digital 


Estimator 




MLMC Complexity 


/3 


MLMC Complexity 


Value 


^ 2.0 




^ 1.4 




Delta 


^ 1.5 




^ 0.5 


C(e-2-5) 


Vega 


^ 2 


0{e-^) 


^ 0.6 


0(e-2-4) 



Table 3: Orders of convergence for Ve as observed numerically and correspond- 
ing MLMC complexity. 





Lookback 


Barrier 


Estimator 


/3 


MLMC Complexity 


/3 


MLMC Complexity 


Value 


^ 1.9 




^ 1.6 




Delta 


^ 1.9 




^ 0.6 




Vega 


^ 1.3 


0{e-') 


^ 0.6 





Table 4: Orders of convergence for Ve as observed numerically and correspond- 
ing MLMC complexity. 



4.5 Split pathwise sensitivities 

There are two difficulties in using conditional expectation to smooth pay- 
offs in practice in financial applications. This first is that conditional ex- 
pectation will often become a multi-dimensional integral without an obvious 
closed-form value, and the second is that it requires a change to the of- 
ten complex software framework used to specify payoffs. As a remedy for 
these problems the splitting technique to approximate E [P(X^i) \ X!^i_-^ and 
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E 



is used. We get numerical estimates of these 
ues by "splitting" every simulated path on the final timestep. At the fine 
level: for every simulated path, a set of s final increments {Ary2<'*}iG[i,s] is 
simulated, which can be averaged to get 



(27) 



1=1 



At the coarse level, similar to the case of digital options, the fine increment of 
the Brownian motion over the first half of the coarse timestep is used. 



E 



P{X. 



e-i 



2i-i J I ^2^-1-1' 2^-2 



2<;-2' ^^2*^-1 J 



(28) 



i=l 



This approach was tested in [8] , with scalar the Milstein scheme used to obtain 
the penultimate step, and is presented in Table [51 As expected the values of 
/3 tend to the rates offered by conditional expectations as s increases and the 
approximation gets more precise. 



Estimator 


s 


(3 


MLMC Complexity 


Value 


10 


^ 2.0 


0(e-2) 




500 


^ 2.0 


0(e-2) 


Delta 


10 


^ 1.0 


0(e-^(log6)^) 




500 


^ 1.5 


0(e-2) 


Vega 


10 


^ 1.6 


0(e-2) 




500 


^ 2.0 


0(e-2) 



Table 5: Orders of convergence for Vi as observed numerically and the corre- 
sponding MLMC complexity. 



4.6 Optimal number of samples 

The use of multiple samples to estimate the value of the conditional expecta- 
tions is an example of the splitting technique p]. If w and z are independent 
random variables, then for any function P{w, z) the estimator 



M 



m=l \ 1=1 

with independent samples and 2;"^'* is an unbiased estimator for 

E^,, \P{w,z)] = ^u, \E,[P{w,z)\w 

and its variance is 

N[YmA = [E,[P{w, z) I w]] + {MSy^ E^ [V,[P( 



W,Z)\W 
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The cost of computing Ym,s with variance viM ^ -\-V2 {MS) ^ is proportional 
to 

C1M + C2 MS, 

with Ci corresponding to the path calculation and C2 corresponding to the pay- 
off evaluation. For a fixed computational cost, the variance can be minimised 
by minimising the product 

{Vi+V2 S~^) (C1 + C2 s) = ■Ui C2 s + Vi Ci + ■i;2 C2 + V2 Ci s~^, 

which gives the optimum value Sopt = \/i'2 Ci/vi C2. 

Ci is 0{AtJ^) since the cost is proportional to the number of timesteps, 
and C2 is 0{1), independent of At^. If the payoff is Lipschitz, then vi and V2 
are both 0{1) and Sopt^O{AtJ^^'^). 



4.7 Vibrato Monte Carlo 

The idea of vibrato Monte Carlo is to combine pathwise sensitivity and Like- 
lihood Ration Method. Adopting the conditional expectation approach, each 
path simulation for a particular set of Brownian motion increments = 
{Awl, ^""^2' • • • ' ^''^2^_i) (excluding the increment for the final timcstcp) com- 
putes a conditional Gaussian probability distribution px{X^^\w^). For a scalar 
SDE, if Hy^e and cr^^ are the mean and standard deviation for given w^, then 



where Z is a unit Normal random variable. The expected payoff can then be 
expressed as 



Ve^E 



E [P{Xie) I w']] - j y P{^) Px^Jx I w') dxj p^e{y) dy. 



The outer expectation is an average over the discrete Brownian motion incre- 
ments, while the inner conditional expectation is averaging over Z. 

To compute the sensitivity to the input parameter 9, the first step is to ap- 
ply the pathwise sensitivity approach for fixed w'' to obtain d^yji/d9, d(7^i/d9. 
We then apply LRM to the inner conditional expectation to get 



d9 
where 



E 



d_ 
89 



E [P{Xi,) I w'] 



E 



E, 



d{\ogPx^] 

P{Xi,) 



d{\ogpxi,) 

2f_ 

d9 



09 



d9 



d{\ogPx^ ) dfi^i 9{\ogpxi ) da. 



89 



w 
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This leads to the estimator 



-E 



m=l 



' 36 
36 

3(T,„e,m. 



■ 3{\ogpx%) „ 

P[X^^,) 



3{\ogpx 



3a„ 



2* \yji,-m 



(29) 



t,m,i 



TXT XL) ^ ' 1 UJ^ - t ' f' _ , TTT'l T.^^ 7n ' 

We compute — — - — and — — -— with pathwise sensitivities. With XJ ' 
36 36 ^ 

X^fXw''''"^, Z^), we substitute the following estimators into (!29|) 



E 



E 



P (x,^0 
P [xi.) 



3{\ogpx 



3jj, 
3{\ogpx 



-21 



21 ~ f^w^'- 



3a,, 



i=l 

s 



a 



-21 



i=l 



f 



1 _^ \^^2' 



0" 



In a multilevel setting, at the fine level we can use fl29|) directly. At the 
coarse level, as for digital options in section \3.5\ the fine Brownian increments 
over the first half of the coarse timestep are re-used to derive ( 129|) . 

The numerical experiments for the call option with s = 10 was obtained 
[8j, with scalar Milstein scheme used to obtain the penultimate step. 



Estimator 


/3 


MLMC Complexity 


Value 


^ 2.0 




Delta 


^ 1.5 




Vega 


^ 2.0 





Although the discussion so far has considered an option based on the value 
of a single underlying value at the terminal time T, it can be shown that the 
idea extends very naturally to multidimensional cases, producing a conditional 
multivariate Gaussian distribution, and also to financial payoffs which are 
dependent on values at intermediate times. 



5 MLMC for Jump-difFusion processes 

Giles and Xia in 07] investigated the extension of the MLMC method to jump- 
diffusion SDEs. We consider models with finite rate activity using a jump- 
adapted discretisation in which the jump times are computed and added to the 
standard uniform discretisation times. If the Poisson jump rate is constant, 
the jump times are the same on both paths and the multilevel extension is 
relatively straightforward, but the implementation is more complex in the 
case of state-dependent jump rates for which the jump times naturally differ. 
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Merton[36] proposed a jump-diffusion process, in which the asset price 
follows a jump- diffusion SDE: 



dx{t) = f{x{t-))At + g{x{t-))Aw{t) + c{x{t-))AJ{t), 0<t<T, (30) 

where the jump term J{t) is a compound Poisson process ^^JiiYi — 1), the 
jump magnitude Yi has a prescribed distribution, and N(t) is a Poisson process 
with intensity A, independent of the Brownian motion. Due to the existence 
of jumps, the process is a cadlag process, i.e. having right continuity with 
left limits. We note that x{t—) denotes the left limit of the process while 
x{t) = lims_>t+ In [36], Merton also assumed that logl^ has a normal 
distribution. 

5.1 A Jump-adapted Milstein discretisation 

To simulate finite activity jump-diffusion processes, Giles and Xia used 



the jump-adapted approximation from Platen and Bruti-Liberat[40j. For each 




path simulation, the set of jump times J = {ti,T2,... ,Tm} within the time 
interval [0, T] is added to a uniform partition "Pa*, := {nAti : = 0, 1, 2, 2'}. 
A combined set of discretisation times is then given by T = {0 = to < < ^2 < 
. . . < tM = T} and we define a the length of the timestep as Atf = — t„. 
Clearly, At]" < AU. 

Within each timestep the scalar Milstein discretisation is used to approxi- 
mate the SDE fl30|) . and then the jump is simulated when the simulation time 
is equal to one of the jump times. This gives the following numerical method: 



where = is the left limit of the approximated path, Aw^ is the 

Brownian increment and Yi is the jump magnitude at Xj. 

5.1.1 Multilevel Monte Carlo for constant jump rate 

In the case of the jump-adapted discretisation the telescopic sum ^ is written 
down with respect to At^ rather than to At". Therefore, we have to define the 
computational complexity as the expected computational cost since different 
paths may have different numbers of jumps. However, the expected number of 
jumps is finite and therefore the cost bound in assumption iv) will still remain 
valid for an appropriate choice of the constant C3. 

The MLMC approach for a constant jump rate is straightforward. The 
jump times r^, which are the same for the coarse and fine paths, are simulated 
by setting Tj — Tj_i ~ exp(A). 





n+l ''"ii 



(31) 
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Pricing European call and Asian options in this setting is straightforward. 
For lookback, barrier and digital options we need to consider Brownian bridge 
interpolations as we did in Section 13.21 However, due to presence of jumps 
some small modifications are required. To improve convergence we will be 
looking at Brownian bridges between time-steps coming from jump- adapted 
discretization. In order to obtain an interpolated value X^^^ for the coarse 

time-step a Brownian Bridge interpolation over interval [kn, k^^ is considered, 
where 

kn = max {nAt'^_^, max {r G J : r < (ra + i)At"_i}} 

kn = min {(n + l)At^_^, min {r G J : r > (n + i)At^_ J}. 

Hence 

x'~\ =xi~' + \,_,ixl-'-xi-') 

+ g{Xi-^) {w\{n + i)At,_i) - w\kn) - (w^kn) - w'ikn))) 

where A^_i = {{n + ^)At'^_-^ - kn)/{kn - kn)- 

In the same way as in Section [3^ the minima over time- adapted discretiza- 
tion can be derived. For the fine time-step we have 

K^^n =\[x'n + i " \j [KI^ "^n) ^ " diX^Y At" log 

Notice the use of the left limits X^~. Following discussion in the previous 
sections, the minima for the coarse time-step can be derived using interpo- 
lated value X^~_^^ ■ Deriving the payoffs for lookback and barrier option is now 

straightforward. 

For digital options, due to jump-adapted time grid, in order to find condi- 
tional expectations, we need to look at relations between the last jump time 
and the last timestep before expiry. In fact, there are three cases: 

1. The last jump time r happens before penultimate fixed-time timestep, 
i.e. r < (2'-i - 2)Ati. 

2. The last jump time is within the last fixed-time timestep , 
i.e. r > (2'-i - l)At;; 

3. The last jump time is within the penultimate fixed-time timestep, 
i.e. (2'-^ - l)At, > r > (2'-^ - 2)Ati. 

With this in mind we can easily write down the payoffs for the coarse and fine 
approximations as we presented in Section 13.51 
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5.1.2 MLMC for Path-dependent rates 



In the case of a path-dependent jump rate X{x{t)), the implementation of the 
muhilevel method becomes more difficuh because the coarse and fine path 
approximations may have jumps at different times. These differences could 
lead to a large difference between the coarse and fine path payoffs, and hence 
greatly increase the variance of the multilevel correction. To avoid this, Giles 
and Xia [47j modified the simulation approach of Glasserman and Merener [26] 
which uses "thinning" to treat the case in which X{x(t),t) is bounded. Let 
us recall the thinning property of Poisson processes. Let {Nt)t>o be a Poisson 
process with intensity A and define a new process Zf by "thinning" Nf. take 
all the jump times (r„, n > 1) corresponding to N, keep then with probability 
< p < 1 or delete then with probability 1 —p, independently from each other. 
Now order the jump times that have not been deleted: (r„, n > 1), and define 

n>l 

Then the process Z is Poisson process with intensity pX. 

In our setting, first a Poisson process with a constant rate Asup (which 
is an upper bound of the state-dependent rate) is constructed. This gives a 
set of candidate jump times, and these are then selected as true jump times 
with probability A(x(t), t)/Asup- The following jump-adapted thinning Milstein 
scheme is obtained 

1. Generate the jump- adapted time grid for a Poisson process with constant 
rate Agup! 

2. Simulate each timestep using the Milstein discretisation; 

3. When the endpoint tn+i is a candidate jump time, generate a uniform 
random number U ~ [0, 1], and if f/ < Pt^+i = ^ ^ "t"^ — )_>_n+i)_^ then 

^sup 

accept tn+i as a real jump time and simulate the jump. 

In the multilevel implementation, the straightforward application of the 
above algorithm will result in different acceptance probabilities for fine and 
coarse level. There may be some samples in which a jump candidate is accepted 
for the fine path, but not for the coarse path, or vice versa. Because of the 
first order strong convergence, the difference in acceptance probabilities will 
be 0{Ati), and hence there is an 0{Ate) probability of coarse and fine paths 
differing in accepting candidate jumps. Such differences will give an 0{1) 
difference in the payoff value, and hence the multilevel variance will be 0{h). 
A more detailed analysis of this is given in |16] . 

To improve the variance convergence rate, a change of measure is used so 
that the acceptance probability is the same for both fine and coarse paths. 
This is achieved by taking the expectation with respect to a new measure Q: 

Ep[p/ - = eq[p/ n Rt - n 

T T 
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where r are the jump times. The acceptance probabihty for a candidate jump 
under the measure Q is defined to be | for both coarse and fine paths, instead 
of = A(X(r— ), r) / Asup- The corresponding Radon-Nikodym derivatives 
are 



Rl 



2pi 



if f/ < - 



2(1 iiU> 



Ri 



2(1 -p^), iiU> 



Since N[Rl - i?^] = ^(At^) and N[Pt - P^-i] = 0{At'^), this results in the 
multilevel correction variance Vq[P£ YIt R( ~ Pt-i Hr Rt\ being C(At^). 

If the analytic formulation is expressed using the same thinning and change 
of measure, the weak error can be decomposed into two terms as follows: 



P,\[R{-P\[Rr 



E, 



Q 



+ Ec 



Using Holder's inequality, the bound ma.x{RT-, Rl) < 2 and standard results 
for a Poisson process, the first term can be bounded using weak convergence 
results for the constant rate process, and the second term can be bounded 
using the corresponding strong convergence results [l6] . This guarantees that 
the multilevel procedure does converge to the correct value. 



5.2 Levy processes 

Dereich and Heidenreich [13] analysed approximation methods for both finite 
and infinite activity Levy driven SDEs with globally Lipschitz payoffs. They 
have derived upper bounds for MLMC variance for the class of path dependent 
payoffs that are Lipschitz continuous with respect to supremum norm. One 
of their main findings is that the rate of MLMC variance converges is closely 
related to Blumenthal-Getoor index of the driving Levy process that measures 
the frequency of small jumps. In [13] authors considered SDEs driven by the 
Levy process 

s{t) = Ew{t) + L{t) +bt, 

where S is the diffusion coefficient, L(t) is a compensated jump process and b 
is a drift coefficient. The simplest treatment is to neglect all the jumps with 
size smaller than h. To construct MLMC they took h^, that is at level i they 
neglected jumps smaller than h^. Then similarly as in the previous section, 
a uniform time discretization Ate augmented with jump times is used. Let 
us denote by AL{t) = L(t) — L{t)^ , the jump-discontinuity at time t. The 

crucial observation is that for h' > h > the jumps of the process L'^ can be 
obtained from those of L'^ by 

AL{t)^ = ^-^f l{|AL{t)ft|>/i'}' 
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this gives the necessary couphng to obtain a good MLMC variance. We define 
a decreasing and invertible function g : (0, oo) — )■ (0, oo) such that 




A li>{dx) < g{h) for all > 0, 



where z/ is a Levy measure, and for for £ G N we define 

At^ = 2"^ and = g-\2^). 

With this choice of Ati and h^, authors in [13} analysed the standard Euler- 
Maruyama scheme for Levy driven SDEs. This approach gives good results for 
a Blumenthal-Getoor index smaller than one. For a Blumenthal-Getoor index 
bigger than one, Gaussian approximation of small jumps gives better results 



6 Multi-dimensional Milstein scheme 

In the previous sections it was shown that by combining a numerical approx- 
imation with the strong order of convergence (9(At^) with MLMC results in 
reduction of the computational complexity to estimate expected values of func- 
tional of SDE solutions with a root-mean-square error of e from 0{e~^) to 
0{e~^). However, in general, to obtain a rate of strong convergence higher 
than 0(At^/^) requires simulation, or approximation, of Levy areas. Giles 
and Szpruch in [22j through the construction of a suitable antithetic multi- 
level correction estimator, showed that we can avoid the simulation of Levy 
areas and still achieve an O(At^) variance for smooth payoffs, and almost 
an 0(At^/^) variance for piecewise smooth payoffs, even though there is only 
Oi^At^/"^) strong convergence. 

In the previous sections we have shown that it can be better to use different 
estimators for the finer and coarser of the two levels being considered, P/ when 
level i is the finer level, and when level £ is the coarser level. In this case, 
we required that 

E[P/] = E[P;] for £ = 1, . . . , L, (33) 

so that 

L 

E[P[] = E[pf] + $^E[P/ - PtJ, 
e=i 

still holds. For lookback, barrier and digital options we showed that we can 
obtain a better MLMC variance by suitable modifying the estimator on the 
coarse levels. By further exploiting the fiexibility of MLMC, Giles and Szpruch 
[22] modified the estimator on the fine levels in order to avoid simulation of 
the Levy areas. 
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6.1 Antithetic MLMC estimator 



Based on the well-known method of antithetic variates (see for example [24]), 
the idea for the antithetic estimator is to exploit the flexibility of the more 
general MLMC estimator by defining Pi_i to be the usual payoff P(X^) coming 
from a level i—1 coarse simulation X'^, and defining P/ to be the average of the 
payoffs P(X-^),P(X°) coming from an antithetic pair of level i simulations, 
and X". 

X-^ will be defined in a way which corresponds naturally to the construction 
of X'^. Its antithetic "twin" X" will be defined so that it has exactly the 
same distribution as X-l" , conditional on X'^, which ensures that E[P(X''^)] = 
E[P(X")] and hence (|3]) is satisfied, but at the same time 



and therefore 



(X-^-X") ^ -(X^-X'^) 



(P(X^) - P(X'=)) ^ - (P(X") - P(X'=)) 



so that I (P(X^) + P(X")) ^ P(X^). This leads to i (P(X^) + P(X'*)) - 
P(X^) having a much smaller variance than the standard estimator P{X^) — 

p\x% 

We now present a lemma which gives an upper bound on the convergence 
of the variance of \ {P{Xf) + P(X«)) - P(X'=). 



Lemma 6.1. If P e C^i 

all xeW^ 



and there exist constants Li,L2 such that for 



dP 



dx 



d^p 



dx^ 



then for p >2, 

E [(i(P(XO + P(X»)) - P(X^))T 

< E [||i(X^+X") -X"f] + 2^^^^+^)^^ E [||X^-X"||''' 



In the multidimensional SDE applications considered in finance, the Mil- 
stein approximation with the Levy areas set to zero, combined with the an- 
tithetic construction, leads to X^ - X" = 0{At^/'^) but X^ - X^ = 0(At). 
Hence, the variance V[i(P/ + P;'') - Pf_i] is 0{At^), which is the order ob- 
tained for scalar SDEs using the Milstein discretisation with its first order 
strong convergence. 



6.2 Clark- Cameron Example 

The paper of Clark and Cameron |9j addresses the question of how accurately 
one can approximate the solution of an SDE driven by an underlying multi- 
dimensional Brownian motion, using only uniformly-spaced discrete Brownian 
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increments. Their model problem is 

dxi(t) = dwi(t) 

dX2{t) = Xi{t)dW2{t), (34) 

with x(0) = y{0) = 0, and zero correlation between the two Brownian motions 
wi{t) and W2{t). These equations can be integrated exactly over a time interval 
[t„,t„+i], where tn = n At, to give 

Xl{tn+l) = Xi{tn) + Awi^n 

X2{tn+l) = X2(tn) + Xi(t„)Aw2,„ + iAwi,„Aw2,n + |^12,n (35) 

where Awi^n = 'U^i(^n+i) — Wi{tn), and Ai2,„ is the Levy area defined as 

rtn + l rtn + 1 

Al2,n= I {Wi{t)~Wi{tn))<lW2{t) - / (^2 (t) - ^2 (^n) ) dWi (t) . 



This corresponds exactly to the Milstein discretisation presented in ((Tj), so for 
this simple model problem the Milstein discretisation is exact. 

The point of Clark and Cameron's paper is that for any numerical approx- 
imation X{T) based solely on the set of discrete Brownian increments Aw, 

n{x2{T) - X2{T)f] > \TAt. 

Since in this section we use superscript /, a,c for fine X'^, antithetic X" 
and coarse approximations, respectively, we drop the superscript i for the 
clarity of notation. 

We define a coarse path approximation X'^ with timestep At by neglecting 
the Levy area terms to give 

Xln^i = X£„ + Xr,„Ai/;^-Vi + i A<„Vi (36) 

This is equivalent to replacing the true Brownian path by a piecewise linear 
approximation as illustrated in Figure [H 

Similarly, we define the corresponding two half-timesteps of the first fine 
path approximation X-^ by 

xL+1 = X^^^^,+Xl^_^,Awl^^, + lAwl^^,Awl^+, 

where Aw^^^i = Aw^_^i + Aw^^_p Using this relation, the equations for the 
two fine timesteps can be combined to give an equation for the increment over 
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the coarse timestep, 

^l,n+l ~ ^l,n "T ^"^l.n+l 



= X{„ + X{„ A«;^T„Vi + I At^f-Vi A<„Vi (37) 



The antithetic approximation is defined by exactly the same discreti- 
sation except that the Brownian increments 6wn and are swapped, as 
illustrated in Figure [1] This gives 

X,Vi = X«„ + A<„+,, 



and hence 

^l,n+l ~ ^l,n T ^"^l.n+l) 

^2%+! = ^£n + ^l,nA<„Vi + iA^-VlA<„Vl (38) 

Swapping Aw^ , and Aw^.^ does not change the distribution of the driving 

Brownian increments, and hence X" has exactly the same distribution as X-^. 
Note also the change in sign in the last term in (P7|) compared to the corre- 
sponding term in f p8|) . This is important because these two terms cancel when 
the two equations are averaged. 




Figure 1: Brownian path and approximations over one coarse timestep 
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These last terms correspond to the Levy areas for the fine path and the 
antithetic path, and the sign reversal is a particular instance of a more general 
result for time-reversed Brownian motion, [30]. If {wt,0 < t < 1) denotes a 
Brownian motion on the time interval [0, 1] then the time- reversed Brownian 
motion {zt,0 < t < 1) defined by 



Zt 



(39) 



has exactly the same distribution, and it can be shown that its Levy area is 
equal in magnitude and opposite in sign to that of Wt- 

Lemma 6.2. IfX^, and are as defined above, then 



and 



E 



N 



T(T+At) At^ 



In the next section we will see how this lemma generalizes to non-linear 
multidimensional SDEs (HI). 



6.3 Milstein discretisation - General theory 

Using the coarse timestep At, the coarse path approximation X^, is given by 
the Milstein approximation without the Levy area term, 

m 

m 
j,k=l 

The first fine path approximation X^ (that corresponds to X^) uses the 
corresponding discretisation with timestep At/2, 

m 

= Xl^ + f,{Xi)At,_j2 + Y.g,,{Xi)Aw' (40) 

m 

+ E ^^^-^-(^n) (^<n+l M,n4-1 " ^t^^l/^) , 

j,k=l 

m 

Hn^l = Kn^+MK^)^'^-^/^ + Y.9^^(K^)^<n^l (41) 

m 

j,k=i 
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where Awi^\ = Aw^_^i + Awi^^. 

The antithetic approximation is defined by exactly the same discreti- 
sation except that the Brownian increments Dw^^i and Aw^_^_i are swapped, 
so that 



+ h^^^^^n) (A<n+1 " At,-l/2) 



(42) 



It can be shown that ^22] 

Lemma 6.3. For a// integers p >2, there exists a constant Kp such that 



E 



max ||X^-X^f 



< K„AtP/^ 



Let's denote the average fine and antithetic path as follows 

The main results of [22] is the following theorem: 

Theorem 6.4. For all p > 2, there exists a constant Kp such that 



E 



max \\X„ 

0<n<N 



X. 



cup 



< KpAtP. 



This together with a classical strong convergence result for Milstein dis- 
cretization allows to estimate the MLMC variance for smooth payoffs, n the 
case of payoff which is a smooth function of the final state x{T), taking p = 2 
in Lemma I6.H p = 4 in Lemma 16.31 and p = 2 in Theorem 16. 4[ immediately 
gives the result that the multilevel variance 



V 



i (P(X^) + P(X^)) - P(X^) 



has an O(At^) upper bound. This matches the convergence rate for the mul- 
tilevel method for scalar SDEs using the standard first order Milstein discreti- 
sation, and is much better than the 0{At) convergence obtained with the 
Euler-Maruyama discretisation. 



34 



However, very few financial payoff functions are twice differentiable on tlie 
entire domain M'^. A more typical 2D example is a call option based on the 
minimum of two assets, 

P{x{T)) = max(0,min(xi(T),a;2(T)) -ir) , 

which is piecewise linear, with a discontinuity in the gradient along the three 
lines (s, K), {K, s) and (s, s) for s > K. 

To handle such payoffs, an assumption which bounds the probability of 
the solution of the SDE having a value at time T close to such lines with 
discontinuous gradients is needed. 

Assumption 6.5. The payoff function P G C(M°',M) has a uniform Lipschitz 
hound, so that there exists a constant L such that 

\P{x) - P{y)\ < L \x - y\ , Vx,yGM'^, 

and the first and second derivatives exist, are continuous and have uniform 
hound L at all points x ^ K , where K is a set of zero measure, and there 
exists a constant c such that the prohahility of the SDE solution x{T) heing 
within a neighhourhood of the set K has the hound 

P ( min||x(r) - y|| < e ) < c e, > 0. 
yy^K J 

In a ID context. Assumption 16.51 corresponds to an assumption of a locally 
bounded density for x{T). 

Giles and Szpruch in [22] proved the following result 

Theorem 6.6. If the payoff satisfies Assumption \6.5\ then 
for any 5 > 0. 



-5\ 



6.4 Piecewise linear interpolation analysis 

The piecewise linear interpolant X^{t) for the coarse path is defined within 
the coarse timestep interval [t^j^fc+i] as 

X\t)^{l-\)Xl + \Xl^,, \^±L^. 

Likewise, the piecewise linear interpolants X^it) and X°'{t) are defined on the 
fine timestep [tfe,tfc4.i] as 



Xf{t)^{l-\)Xl + \Xf X\t)^{l-\)Xl + \Xl^,, A = 



2 



and there is a corresponding definition for the fine timestep [t^^i , t^+i]. It can 
be shown that [22] 
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Theorem 6.7. For all p > 2, there exists a constant Kp such that 

sup E[\\Xf{t)-X%t)r] <KpAt^/\ 



0<t<T 



sup E 
o<t<r 



X'{t)-X''{t) <KpAtP, 



where X (t) is the average of the piecewise linear interpolants X^ (t) and X°-{t). 



For an Asian option, the payoff depends on the average 

Jo 



Xave = I X(t)dt. 



This can be approximated by integrating the appropriate piecewise hnear in- 
terpolant which gives 



N-l 



•^0 n=0 

T N-l 

XL ^ Xf{t)dt = N-'J2-,iXl + 2Xl^, + Xl^,), 

•^0 n=0 ' 

N-l 



XI, = f X'^{t)dt = Ar-i^i(X^ + 2X«^i+X„V). 

•^0 n=0 

Due to Holder's inequahty, 

E[||^L-^:.eir] < rE[\\Xf{t)-X%t)f]dt 

Jo 



< supE[||x/(t)-x«(t)in, 

[0,T] 



and similarly 



E [ U{XL+X:J - Xm'] < supE X^{t) - X\t) 

[0,T] L 

Hence, if the Asian payoff is a smooth function of the average, then we obtain 
a second order bound for the multilevel correction variance. 

This analysis can be extended to include payoffs which are a smooth func- 
tion of a number of intermediate variables, each of which is a linear functional 
of the path x{t) of the form 

T 

T 



g'{t) x(t) fx(dt), 

for some vector function g(t) and measure yu(dt). This includes weighted av- 
erages of x{t) at a number of discrete times, as well as continuously-weighted 
averages over the whole time interval. 
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As with the European options, the analysis can also be extended to payoffs 
which are Lipschitz functions of the average, and have first and second deriva- 
tives which exist, and are continuous and uniformly bounded, except for a set 
of points K of zero measure. 

Assumption 6.8. The payoff P G C(]R'^,]R) has a uniform Lipschitz hound, 
so that there exists a constant L such that 



\P{x) — -P(y)| < L \x — y\ , ^x,y G 



nd 



and the first and second derivatives exist, are continuous and have uniform 
bound L at all points x ^ K, where K is a set of zero measure, and there exists 
a constant c such that the probability of Xave being within a neighbourhood of 
the set K has the bound 

P I min \\xave — yll < £^ 1 < c £, Ve > 0. 



Theorem 6.9. If the payoff satisfies Assumption \6.8[ then 

E [iUPiXU + PiX:J) - P{X:J)'] = o{At'/'- 
for any S > 0. 

We refer the reader to p2] for more details. 



6.5 Simulations for antithetic Monte Carlo 

Here we present numerical simulations for a European option for process 
simulated by , X*^ and X'^ defined in section 16.21 with initial conditions 
Xi(0) = X2(0) = 1 . The results in Figure [2] are for a European call op- 
tion with terminal time 1 and strike K = 1, that is P = {x{T) — K)^ 
The top left left plot shows the behaviour of the variance of both and 
Pf, — Pi-i- The superimposed reference slope with rate 1.5 indicates that 
the variance = W[Pe — Pe-i] = O^Atj-^), corresponding to 0(e~^) com- 
putational complexity of antithetic MLMC The top right plot shows that 
K[Pi — Pe-i] = 0{Ate). The bottom left plot shows the computational com- 
plexity C (as defined in Theorem 12.11) with desired accuracy e. The plot is 
of C versus e, because we expect to see that C is only weakly depen- 
dent on e for MLMC. For standard Monte Carlo, theory predicts that C 
should be proportional to the number of timesteps on the finest level, which 
in turn is roughly proportional to due to the weak convergence order. For 
accuracy e = 10~^, the antithetic MLMC is approximately 500 times more 
efficient than the standard Monte Carlo. The bottom right plot shows that 
Y[Xi,e — = 0{Ati). This corresponds to standard strong convergence 
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Figure 2: Call option. 



of order 0.5. We have also tested the algorithm presented in [22] for approx- 
imation of Asian options. Our results were almost identical as for European 
options. In order to treat the lookback, digital and barrier options we found 
that a suitable antithetic approximation to the Levy areas are needed. For 
suitable modification of the antithetic MLMC estimator we performed numer- 
ical experiments where we obtained 0(e~^ log(e)^) complexity for estimating 
barrier, digital and lookback options. Currently, we are working on theoretical 
justification of our results. 



7 Other uses of multilevel method 



7.1 SPDEs 

Multilevel method has been used for a number of parabolic and elliptic SPDE 
applications [H |TOl |27] but the first use for a financial SPDE is in a new paper 
by Giles & Reisinger pT] . 

This paper considers an unusual SPDE which results from modelling credit 
default probabilities. 
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subject to boundary condition p(0, t) =0. Here p{x, t) represents the probabil- 
ity density function for firms being a distance x from default at time t. The 
diffusive term is due to idiosyncratic factors affecting individual firms, while 
the stochastic term due to the scalar Brownian motion Mj corresponds to the 
systemic movement due to random market effects affecting all firms. 

Using a Milstein time discretisation with uniform timestep fc, and a central 
space discretisation of the spatial derivatives with uniform spacing h gives the 
numerical approximation 

+ """'at""''" fei - 2P" + P?-0 . (44) 

where the Z„ are standard Normal random variables so that \/hZn corresponds 
to an increment of the driving scalar Brownian motion. 

The paper shows that the requirment for mean-square stability as the grid 
is refined and /c, /i — )■ is k/h'^ < (1 + 2p^)~^, and in addition the accuracy 
is Olkjh"^). Because of this, the multilevel treatment considers a sequence of 
grids with hi = 2 h£_i, ki = 4 ki_i. 

The multilevel implementation is very straightforward, with the Brownian 
increments for the fine path being summed pairwise to give the corresponding 
Brownian increments for the coarse path. The payoff corresponds to different 
tranches of a credit derivative that depends on a numerical approximation of 
the integral 

poo 

p{x, t) Ax. 







The computational cost increases by factor 8 on each level, and numerical 
experiments indicate that the variance decreases by factor 16. The MLMC 
Theorem still applies in this case, with /3 = 4 and 7 = 3, and so the overall 
computational complexity to achieve an 0(e) RMS error is again (9(e"^). 



7.2 Nested simulation 

The pricing of American options is one of the big challenges for Monte Carlo 
methods in computational finance, and Belomestny & Schoenmakers have re- 
cently written a very interesting paper on the use of multilevel Monte Carlo for 
this purpose [5]. Their method is based on Anderson & Broadie's dual simula- 
tion method [1] in which a key component at each timestep in the simulation 
is to estimate a conditional expectation using a number of sub-paths. 

In their multilevel treatment, Belomestny & Schoenmakers use the same 
uniform timestep on all levels of the simulation. The quantity which changes 
between different levels of simulation is the number of sub-samples used to 
estimate the conditional expectation. 
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To couple the coarse and fine levels, the fine level uses Ni sub-samples, 
and the coarse level uses Ni_i = Ni/2 of them. Similar research by N. Chen [J 
found the multilevel correction variance is reduced if the payoff on the coarse 
level is replaced by an average of the payoffs obtained using the first Ni/2 
and the second Ni/2 samples. This is similar in some ways to the antithetic 
approach described in section [61 

In future research, Belomestny & Schoenmakers intend to also change the 
number of timesteps on each level, to increase the overall computational ben- 
efits of the multilevel approach. 



7.3 Truncated series expansions 

Building on earlier work by Broadie and Kaya [7j, Glasserman and Kim have 
recently developed an efficient method [25j of exactly simulating the Heston 
stochastic volatility model [29] . 

The key to their algorithm is a method of representing the integrated 
volatility over a time interval [0, T], conditional on the initial and final values, 
vq and vt as 

(n-p s OO OO OO 

/ Vsds Vo = Vo,Vt = VtJ = ^Xn + ^Vn + ^Zn 
/ „=l n=l n=l 

where Xn,yn,Zn are independent random variables. 

In practice, they truncate the series expansions at a level which ensures the 
desired accuracy, but a more severe truncation would lead to a tradeoff between 
accuracy and computational cost. This makes the algorithm a candidate for a 
multilevel treatment in which level i computation performs the truncation at 
A^^ (taken to be the same for all three series, for simplicity). 

To give more details, the level i computation would use 

Ne Ni Ni 

n=l n=l n=l 

while the level i—1 computation would use 

Aff_i Ne-i Ne_i 

n=l n=l n=l 

with the same random variables Xn,yn,Zn- 

This kind of multilevel treatment has not been tested experimentally, but it 
seems that it might yield some computational savings even though Glasserman 
and Kim typically retain only 10 terms in their summations through the use 
of a carefully constructed estimator for the truncated remainder. In other 
circumstances requiring more terms to be retained, the savings may be larger. 



^unpublished, but presented at the MCQMC12 conference. 
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7.4 Mixed precision arithmetic 

The final example of the use of multilevel is unusual, because it concerns the 
computer implementation of Monte Carlo algorithms. 

In the latest CPUs from Intel and AMD, each core has a vector unit which 
can perform 8 single precision or 4 double precision operations with one in- 
struction. Together with the obvious fact that double precision variables are 
twice as big as single pecision variables and so require twice as much time to 
transfer, in bulk, it leads to single precision computations being twice as fast 
as double precision computations. On CPUs (graphics processors) the differ- 
ence in performance can be even larger, up to a factor of eight in the most 
extreme cases. 

This raises the question of whether single precision arithmetic is sufficient 
for Monte Carlo simulation. In general, our view is that the errors due to 
single precision arithmetic are much smaller than the errors due to 

• statistical error due to Monte Carlo sampling; 

• bias due to SDE discretisation; 

• model uncertainty. 

We have just two concerns with single precision accuracy: 

• there can be significant errors when averaging the payoffs unless one uses 
binary tree summation [?] to perform the summation; 

• when computing Greeks using "bumping", the single precision inaccu- 
racy can be greatly amplified if a small bump is used. 

Our advice would be to always use double precision for the final accumu- 
lation of payoff values, and pathwise sensitivity analysis as much as possible 
for computing Greeks, but if there remains a need for the path simulation 
to be performed in double precision then one could use two-level approach in 
which level corresponds to single precision and level 1 corresponds to double 
precison. 

On both levels one would use the same random numbers. The multilevel 
analysis would then give the optimal allocation of effort between the single 
precision and double precision computations. Since it is likely that most of 
the calculations would be single precision, the computational savings would 
be a factor two or more compared to standard double precision calculations. 

8 Multilevel Quasi-Monte Carlo 

In Theorem 12.11 if /3 > 7, so that rate at which the multilevel variance decays 
with increasing grid level is greater than the rate at which the computational 
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cost increases, then the dominant computational cost is on the coarsest levels 
of approximation. 

Since coarse levels of approximation correspond to low- dimensional numer- 
ical quadrature, it is quite natural to consider the use of quasi-Monte Carlo 
techniques. This has been investigated by Giles & Waterhouse [23j in the con- 
text of scalar SDEs with a Lipschitz payoff. Using the Milstein approximation 
with a doubling of the number of timesteps on each level gives (3 = 2 and 
7 = 1. They used a rank-1 lattice rule to generate the quasi-random num- 
bers, randomisation with 32 independent offsets to obtain confidence intervals, 
and a standard Brownian Bridge construction of the increments of the driving 
Brownian process. 

Their empirical observation was that MLMC on its own was better than 
QMC on its own, but the combination of the two was even better. The QMC 
treatment greatly reduced the variance per sample for the coarsest levels, re- 
sulting in significantly reduced costs overall. In the simplest case of a European 
call option, shown in Figure |3l the top left plot shows the reduction in the vari- 
ance per sample as the number of QMC points is increased. The benefit is 
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much greater on the coarsest levels than on the finest levels. In the bottom two 
plots, the number of QMC points on each level is determined automatically to 
obtain the required accuracy; see [23] for the precise details. Overall, the com- 
putational complexity appears to be reduced from 0{e~^) to approximately 



Giles & Waterhouse interpreted the fact that the variance is not reduced on 
the finest levels as being due to a lack of significant low-dimensional content, 
i.e. the difference in the two payoffs due to neighbouring grid levels is due to 
the difference in resolution of the driving Brownian path, and this is inherently 
of high dimensionality. This suggests that in other applications with /3 < 7, 
which would lead to the dominant cost being on the finest levels, then the use 
of quasi- Monte Carlo methods is unlikely to yield any benefits. 

Further research is needed in this area to investigate the use of other low- 
discrepancy sequences (e.g. Sobol) and other ways of generating the Brownian 
increments (e.g. PCA). We also refer the reader to ^13] for some results for 
randomized multilevel quasi-Monte Carlo. 

9 Conclusions 

In the past 6 years, considerable progress has been achieved with the multilevel 
Monte Carlo method for financial options based on underlying assets described 
by Brownian diffusions, jump diffusions, and more general Levy processes. 

The multilevel approach is conceptually very simple. In essence it is a 
recursive control variate strategy, using a coarse path simulation as a control 
variate for a fine path simulation, relying on strong convergence properties to 
ensure a very strong correlation between the two. 

In practice, the challenge is to couple the coarse and fine path simulations 
as tightly as possible, minimising the difference in the payoffs obtained for 
each. In doing this, there is considerable freedom to be creative, as shown in 
the use of Brownian Bridge constructions to improve the variance for lookback 
and barrier options, and in the antithetic estimators for multi-dimensinal SDEs 
which would require the simulation of Levy areas to achieve first order strong 
convergence. Another challenge is avoiding large payoff differences due to 
discontinuous payoffs; here one can often use either conditional expectations 
to smooth the payoff, or a change of measure to ensure that the coarse and 
fine paths are on the same side of the discontinuity. 

Overall, multilevel methods are being used for an increasingly wide range 
of apphcations. This biggest savings are in situations in which the coarsest 
approximation is very much cheaper than the finest. If the finest level of 
approaximation has only 32 timesteps, then there are very limited savings to 
be achieved, but if the finest level has 256 timesteps, then the potential savings 
are much larger. 

Looking to the future, exciting areas for further research include: 
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• more research on multilevel techniques for American and Bermudan op- 
tions; 

• more investigation of multilevel Quasi Monte Carlo methods; 

• use of multilevel ideas for completely new financial applications, such as 
Gaussian copula and new SPDE models. 
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